MoveNetSinglePoseLighting
This model is a convolutional neural network that runs on RGB images and predicts the joint locations of a single person. The keypoints are available in org.jetbrains.kotlinx.dl.onnx.inference.posedetection.keyPoints, and the edges that connect them are in org.jetbrains.kotlinx.dl.onnx.inference.posedetection.edgeKeyPointsPairs.
Model architecture: MobileNetV2 image feature extractor with a Feature Pyramid Network decoder (to a stride of 4), followed by CenterNet prediction heads with custom post-processing logic. Lightning uses a depth multiplier of 1.0.
The model has an input tensor with type INT32 and shape [1, 192, 192, 3].
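To make the input layout concrete, here is a minimal sketch of packing an image into a flat integer array laid out as [1, 192, 192, 3]. It is not the library's preprocessing pipeline. The function name toInputTensor is hypothetical, and the sketch assumes channel values in the 0..255 range in R, G, B order and a plain stretch to 192x192 that ignores the aspect ratio. None of these details are stated above.

```kotlin
import java.awt.image.BufferedImage

const val INPUT_SIZE = 192

/**
 * Hypothetical helper: packs an image into a flat IntArray laid out as
 * [1, 192, 192, 3] (batch, height, width, channel).
 * Assumption: channel values are 0..255 in R, G, B order.
 */
fun toInputTensor(image: BufferedImage): IntArray {
    // Stretch the image to 192x192 (aspect ratio is not preserved in this sketch).
    val resized = BufferedImage(INPUT_SIZE, INPUT_SIZE, BufferedImage.TYPE_INT_RGB)
    val graphics = resized.createGraphics()
    graphics.drawImage(image, 0, 0, INPUT_SIZE, INPUT_SIZE, null)
    graphics.dispose()

    val tensor = IntArray(INPUT_SIZE * INPUT_SIZE * 3)
    var i = 0
    for (y in 0 until INPUT_SIZE) {          // rows first (height)
        for (x in 0 until INPUT_SIZE) {      // then columns (width)
            val rgb = resized.getRGB(x, y)   // packed as 0xRRGGBB
            tensor[i++] = (rgb shr 16) and 0xFF  // red
            tensor[i++] = (rgb shr 8) and 0xFF   // green
            tensor[i++] = rgb and 0xFF           // blue
        }
    }
    return tensor
}
```

The resulting array has 192 * 192 * 3 = 110592 elements. The leading batch dimension of 1 adds no elements to the flat layout.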
The model has 1 output:
output_0 tensor with type FLOAT32 and shape [1, 1, 17, 3], with 17 rows corresponding to the following keypoints: [nose, left eye, right eye, left ear, right ear, left shoulder, right shoulder, left elbow, right elbow, left wrist, right wrist, left hip, right hip, left knee, right knee, left ankle, right ankle]. Each row contains 3 numbers, [y, x, confidence_score], normalized to the [0.0, 1.0] range.
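The output layout can be illustrated with a short sketch that turns the flattened [1, 1, 17, 3] tensor into named keypoints in pixel coordinates. The Keypoint class, the decodeKeypoints function, and the minScore parameter are hypothetical names introduced for this example and are not part of the library API. The sketch assumes the tensor has already been read into a FloatArray of 51 values in row-major order.

```kotlin
/** Hypothetical container for one decoded keypoint, in pixel coordinates. */
data class Keypoint(val name: String, val x: Float, val y: Float, val score: Float)

// Keypoint order as listed in the output description above.
val KEYPOINT_NAMES = listOf(
    "nose", "left eye", "right eye", "left ear", "right ear",
    "left shoulder", "right shoulder", "left elbow", "right elbow",
    "left wrist", "right wrist", "left hip", "right hip",
    "left knee", "right knee", "left ankle", "right ankle"
)

/**
 * Decodes a flattened [1, 1, 17, 3] output into keypoints.
 * Each row is [y, x, confidence_score], all normalized to [0.0, 1.0].
 * minScore is an illustrative filter, not a value recommended by the model.
 */
fun decodeKeypoints(
    output: FloatArray,
    imageWidth: Int,
    imageHeight: Int,
    minScore: Float = 0.0f
): List<Keypoint> {
    require(output.size == KEYPOINT_NAMES.size * 3) {
        "Expected ${KEYPOINT_NAMES.size * 3} values, got ${output.size}"
    }
    return KEYPOINT_NAMES.mapIndexed { i, name ->
        val y = output[3 * i]          // note: y comes before x in each row
        val x = output[3 * i + 1]
        val score = output[3 * i + 2]
        // Scale normalized coordinates back to the original image size.
        Keypoint(name, x * imageWidth, y * imageHeight, score)
    }.filter { it.score >= minScore }
}
```

For example, a row of [0.5, 0.25, 0.9] for the nose on a 640x480 image maps to the pixel position x = 160, y = 240 with a confidence of 0.9. Swapping x and y is an easy mistake here, because each row stores y first.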